arXiv:1504.06745v2 [math.ST] 28 Mar 2016 


Extreme points of a ball about a measure with finite support 


Houman Owhadi and Clint Scovel 
California Institute of Technology 

March 29, 2016 


Abstract 

We show that, for the space of Borel probability measures on a Borel subset 
of a Polish metric space, the extreme points of the Prokhorov, Monge-Wasserstein 
and Kantorovich metric balls about a measure whose support has at most n points, 
consist of measures whose supports have at most n + 2 points. Moreover, we use the 
Strassen and Kantorovich-Rubinstein duality theorems to develop representations 
of supersets of the extreme points based on linear programming, and then develop 
these representations towards the goal of their efficient computation. 


1 Introduction 

In a recent work by Wozabal [20], a framework for optimization under ambiguity is 
developed, including a discussion of the history of the subject and the current literature. 
See also Dupacova [9] and the recent work by Esfahani and Kuhn [10], which expands 
Wozabal’s approach to develop convex reductions for an important class of objective 
functions. We quote from the abstract: “Though the true distribution is unknown, 
existence of a reference measure P enables the construction of non-parametric ambiguity 
sets as Kantorovich balls around P. The original stochastic optimization problems are 
robustified by a worst case approach with respect to these ambiguity sets.” Fundamental 
to the development of this framework, Wozabal [20, Cor. 1] asserts that, when the 
domain is a compact metric space, the extreme points of a Kantorovich ball about a 
measure whose support has at most n points consist of measures whose supports have 
at most n + 3 points. The purpose of this paper is to extend and sharpen this result; 
extending the domain from a compact metric space to a Borel subset of a Polish metric 
space, and improving the bound on the number of Dirac masses from n + 3 to n + 2. 
In addition, we provide similar results for the Prokhorov metric and for the Monge- 
Wasserstein distances. This increase in generality from a compact metric space to a 
Borel subset of a Polish space has two nontrivial components. The first is that it replaces 
compactness with separability. That is, since a compact metric space is complete, it 
amounts to a generalization from compact complete metric spaces to separable complete 
metric spaces. The second is that it replaces completeness with measurability. That is, 
it eliminates the completeness requirement and substitutes it with the requirement that 
it be a Borel subset of a separable complete metric space. For example, these results now 
apply to the case of probability measures on the (noncompact) open interval (0,1). 

To outline how they are obtained, recall Rogosinski's Lemma [13], that on an arbitrary 
measurable space, the n moments corresponding to the expected values of n integrable 
functions with respect to a probability measure can be achieved by a convex sum 
of n + 1 Dirac masses. Moreover, recall that an exposed point of a convex set in a locally 
convex space is a point which is the unique maximizer of some continuous affine function, 
and Straszewicz' Theorem [15], that the exposed points of a finite dimensional compact 
convex set are dense in its extreme points. Wozabal uses the Kantorovich-Rubinstein 
Theorem combined with Rogosinski's Lemma [13] to characterize the exposed points 
of the Kantorovich ball about a measure whose support has at most n points as 
measures whose supports have at most n + 3 points. The fact that one obtains n + 3 Dirac 
masses comes from the fact that the Kantorovich-Rubinstein theorem introduces one function, 
the notion of an exposed point another, and the central measure having support of size n 
introduces n more functions, leading to a total of n + 2 continuous functions on the set 
of probability measures on X × X, so that Rogosinski's Lemma implies that the exposed 
points are convex sums of (n + 2) + 1 = n + 3 Dirac masses. Then, Choquet's [5, Sec. 17, 
pg. 99] extension of Straszewicz' Theorem [15] to compact metrizable subsets of a locally 
convex space, along with the fact that the set of probability measures equipped with the 
weak topology is compact and metrizable when the domain is, is used to show that these 
exposed points are dense in the extreme points. A limiting argument showing that the 
weak limit of a convex sum of n + 3 Dirac masses is a convex sum of n + 3 Dirac masses 
establishes the assertion. 

In our approach, we use Dudley’s [8, Thm. 11.8.2] version of the Kantorovich- 
Rubinstein Theorem for tight measures on separable metric spaces, and characterize the 
extreme points of the space of measures corresponding to the Kantorovich-Rubinstein 
duality using results of Winkler [19, 18], previously applied in [12] to the reduction of 
optimization problems on non-compact spaces of tight probability measures arising in 
Uncertainty Quantification. Since, by Suslin's Theorem, a Borel subset of a Polish space 
is Suslin and since all probability measures on Suslin spaces are tight, these results allow 
the extension of many results regarding the extreme points of sets of probability measures 
from compact metric domains and continuous moment functions to Borel subsets 
of Polish metric spaces and measurable moment functions. Then a fundamental result 
that is implicit in the results of Winkler [19, 18] is proven in Theorem 2.2; that a weakly 
closed convex set of probability measures on a Borel subset of a Polish metric space has 
an extreme point. This result combined with Lemma 5.2, giving sufficient conditions 
that the affine image of the extreme points of a set cover the extreme points of the affine 
image of that set, shows that the image of these extreme points in the dual cover the 
extreme points of the Kantorovich ball. This latter approach has the advantage that 
it does not pass through the intermediate stage of exposed points, so does not add an 
additional function, and does not require a generalization of Straszewicz’ Theorem [15] 
to non-compact sets, although it does suggest that such a generalization may exist for 
weakly closed convex sets of tight measures. 

To establish our main result, Theorem 2.1, we develop a more general and expressive 
result in Theorem 2.3. It not only produces a similar result for the Monge-Wasserstein 
metric, but its Corollary 3.1 also shows how the duality results of Kantorovich-Rubinstein 
and Strassen, combined with the results of Winkler [19] on the extreme points 
of moment constraints, facilitate a Monge-Wasserstein linear programming representation 
of supersets of the extreme points which can be used for convex maximization over 
the Kantorovich or Prokhorov ball about a measure whose support has at most n points. 
A stronger application of Winkler [19, Thm. 2.1] is then used to more fully develop these 
representations in Section 3 towards the goal of their efficient computation. Finally, in 
Section 4 we consider the case when the central measure is an empirical measure. 


2 Main Results 


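For a metric space $(X,d)$, the Prokhorov metric $d_{Pr}$ on the space $\mathcal{M}(X)$ of Borel 
probability measures is defined by 

$$d_{Pr}(\mu_1,\mu_2) := \inf\{\varepsilon : \mu_1(A) \le \mu_2(A^{\varepsilon}) + \varepsilon,\ A \in \mathcal{B}(X)\}, \quad \mu_1,\mu_2 \in \mathcal{M}(X), \tag{2.1}$$

where 

$$A^{\varepsilon} := \{x' \in X : d(x,x') < \varepsilon \text{ for some } x \in A\}.$$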
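
As an illustration of definition (2.1), the following Python sketch checks, for two finitely supported measures, whether a given ε satisfies the defining inequality; the Prokhorov distance is then the infimum of such ε. The function name `prokhorov_condition_holds` and the representation of a measure as a dictionary from points to masses are hypothetical choices made for this sketch and do not come from the paper. The sketch uses the observation that, when the first measure is finitely supported, it suffices to test subsets A of its support, since intersecting A with the support leaves the mass of A unchanged and can only shrink the inflated set.

```python
from itertools import combinations


def prokhorov_condition_holds(mu1, mu2, dist, eps):
    """Return True if mu1(A) <= mu2(A^eps) + eps for every A in supp(mu1).

    mu1, mu2: dicts mapping support points to positive masses (hypothetical
    representation); dist: the metric d; eps: candidate radius.
    """
    support1 = list(mu1)
    for size in range(1, len(support1) + 1):
        for subset in combinations(support1, size):
            mass_a = sum(mu1[x] for x in subset)
            # Mass that mu2 gives to the open eps-inflation of the subset.
            mass_inflated = sum(
                weight
                for z, weight in mu2.items()
                if any(dist(x, z) < eps for x in subset)
            )
            if mass_a > mass_inflated + eps + 1e-12:
                return False
    return True


# Two Dirac masses on the real line at distance 0.3 from each other.
abs_dist = lambda a, b: abs(a - b)
mu1, mu2 = {0.0: 1.0}, {0.3: 1.0}
print(prokhorov_condition_holds(mu1, mu2, abs_dist, 0.31))  # inequality holds
print(prokhorov_condition_holds(mu1, mu2, abs_dist, 0.2))   # inequality fails
```

The enumeration is exponential in the size of the support, so this is only meant to make the definition concrete for small examples.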


According to Dudley [8, Thm. 11.3.3], when $X$ is separable the Prokhorov metric metrizes 
weak convergence. Note that this definition produces the same metric if we were to use 
the "closed" inflated sets $\{x' \in X : d(x,x') \le \varepsilon \text{ for some } x \in A\}$ instead. On 
the other hand, the Kantorovich distance $d_K$ on the space $\mathcal{M}(X)$ of Borel probability 
measures on a separable metric space $X$ is defined as follows, see Vershik [16] for a 
historical review: Let 

$$\|f\|_L := \sup_{x_1 \ne x_2} \frac{|f(x_1) - f(x_2)|}{d(x_1,x_2)}$$

denote the Lipschitz norm of a real valued function on $X$. Then the Kantorovich distance 
is defined by 

$$d_K(\mu_1,\mu_2) := \sup_{\|f\|_L \le 1} \int f \, d(\mu_1 - \mu_2). \tag{2.2}$$

According to the remark after [8, Lem. 11.8.3], $d_K$ is an extended metric on $\mathcal{M}(X)$. Let 
$\Delta_n(X) \subset \mathcal{M}(X)$ denote the set of probability measures whose supports have at most $n$ 
points, and let $\mathrm{ext}(A)$ denote the set of extreme points of a set $A$. We can now state our 
result for the Prokhorov metric and Kantorovich extended metric. For either of these 
$d := d_K$ or $d := d_{Pr}$, for $\mu \in \mathcal{M}(X)$ we define $B_\varepsilon(\mu) := \{\mu' \in \mathcal{M}(X) : d(\mu',\mu) \le \varepsilon\}$. 
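
For finitely supported measures on the real line with the metric |x − x'|, the Kantorovich distance (2.2) can be evaluated as the integral of the absolute difference of the two cumulative distribution functions. This is a standard one-dimensional fact that is used here only for illustration; it is not a result of this paper. The Python sketch below assumes this setting, and the function name `kantorovich_on_line` and the dictionary representation of measures are hypothetical.

```python
def kantorovich_on_line(mu1, mu2):
    """Kantorovich distance between finitely supported measures on R.

    mu1, mu2: dicts mapping real support points to masses (hypothetical
    representation). Uses the one-dimensional identity
    d_K = integral of |F1 - F2|, with F1, F2 the distribution functions.
    """
    points = sorted(set(mu1) | set(mu2))
    total = 0.0
    cdf_gap = 0.0  # running value of F1 - F2 on the current interval
    for left, right in zip(points, points[1:]):
        cdf_gap += mu1.get(left, 0.0) - mu2.get(left, 0.0)
        total += abs(cdf_gap) * (right - left)
    return total


# Half of the mass has to travel from 1 to 2.
mu1 = {0.0: 0.5, 1.0: 0.5}
mu2 = {0.0: 0.5, 2.0: 0.5}
print(kantorovich_on_line(mu1, mu2))
```

With such a routine, membership of a finitely supported measure in the Kantorovich ball about a reference measure on the line reduces to comparing the returned value with ε.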


Theorem 2.1. Let $X$ be a Borel subset of a Polish metric space and consider the space 
$\mathcal{M}(X)$ of Borel probability measures equipped with the Prokhorov metric or the 
Kantorovich extended metric. For $n \in \mathbb{N}$, $\varepsilon > 0$ and $\mu_n \in \Delta_n(X)$, consider the ball $B_\varepsilon(\mu_n)$ 
about the measure $\mu_n$. Then 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset \Delta_{n+2}(X).$$

Our path to Theorem 2.1 requires the development of more useful results which we 
now describe. At the heart of the matter is a result of Winkler regarding the existence of 
extreme points of closed convex sets of probability measures that is implicit in the results 
of Winkler [18, 19]. Since this result is more modest than Winkler's goal of developing 
integral representations, the proof we present appears somewhat simpler; in particular, 
it is different in that it does not utilize Lusin's Theorem. 


Theorem 2.2 (Winkler). Let $X$ be a Borel subset of a Polish metric space and consider 
the set $\mathcal{M}(X)$ of probability measures equipped with the weak topology. Then every 
nontrivial closed convex subset of $\mathcal{M}(X)$ has an extreme point. 

Winkler's Theorem 2.2 is fundamental in the proof of our second main result, the 
following Theorem 2.3, regarding extreme points associated with the Monge-Wasserstein 
distance. This result, combined with the duality results of Strassen and Kantorovich-Rubinstein, 
is then used to establish Theorem 2.1. Moreover, in Section 3, Corollary 3.1 to Theorem 
2.3 establishes the main results to be used towards the computation of supersets of 
the extreme points $\mathrm{ext}(B_\varepsilon(\mu_n))$, useful for convex maximization, in particular linear 
programming, over the ball $B_\varepsilon(\mu_n)$. 

For any two probability measures $\mu_1,\mu_2 \in \mathcal{M}(X)$, let $M(\mu_1,\mu_2) \subset \mathcal{M}(X \times X)$ 
denote those probability measures with marginals $\mu_1$ and $\mu_2$. Then for a non-negative 
lower semicontinuous real-valued cost function $c : X \times X \to \mathbb{R}$, the Monge-Wasserstein 
distance $d_W$ on $\mathcal{M}(X)$ is defined by 

$$d_W(\mu_1,\mu_2) := \inf_{\nu \in M(\mu_1,\mu_2)} \int c(x,x') \, d\nu(x,x').$$

Let $P_1 : \mathcal{M}(X \times X) \to \mathcal{M}(X)$ denote the marginal map corresponding to the first 
component and $P_2$ the marginal map with respect to the second component. 

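Theorem 2.3. Let $X$ be a Borel subset of a Polish metric space and $c : X \times X \to \mathbb{R}$ 
a non-negative real-valued lower semicontinuous function. For $n \in \mathbb{N}$, $\varepsilon > 0$ and 
$\mu_n \in \Delta_n(X)$, consider the subset 

$$\Gamma_{\mu_n,\varepsilon} := \Big\{\nu \in \mathcal{M}(X \times X) : P_1\nu = \mu_n,\ \int c(x,x') \, d\nu(x,x') \le \varepsilon\Big\}.$$

Then 

$$\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Delta_{n+2}(X \times X)$$

and 

$$P_2(\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})) \supset \mathrm{ext}(P_2(\Gamma_{\mu_n,\varepsilon})).$$

In particular, we have 

$$\mathrm{ext}(P_2(\Gamma_{\mu_n,\varepsilon})) \subset \Delta_{n+2}(X).$$


3 Computation of supersets 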


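Now we show how the duality results of Strassen and Kantorovich-Rubinstein combined 
with Theorem 2.3 can be used in the computation of supersets of the extreme points of 
$B_\varepsilon(\mu_n)$. To begin we introduce some terminology. We say that a set $B$ is a superset for 
$B_\varepsilon(\mu_n)$ if 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset B \subset B_\varepsilon(\mu_n). \tag{3.3}$$

For any function $F$ which achieves its maximum at the extreme points, that is 

$$\max_{\mu \in B_\varepsilon(\mu_n)} F(\mu) = \max_{\mu \in \mathrm{ext}(B_\varepsilon(\mu_n))} F(\mu),$$

it follows that 

$$\max_{\mu \in B_\varepsilon(\mu_n)} F(\mu) = \max_{\mu \in B} F(\mu)$$

for any superset $B$ for $B_\varepsilon(\mu_n)$. Consequently, efficiently constructed supersets facilitate 
the efficient solution to optimization problems over $B_\varepsilon(\mu_n)$. To fix terms, we restrict our 
attention to the Prokhorov case, the Kantorovich case being essentially the same. For 
fixed $\varepsilon > 0$ and $\mu_n \in \Delta_n(X)$, let us consider the Prokhorov ball $B_\varepsilon(\mu_n)$. Then it is clear 
that since $\mathrm{ext}(B_\varepsilon(\mu_n)) \subset B_\varepsilon(\mu_n)$ we obtain from Theorem 2.1 that 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset B_\varepsilon(\mu_n) \cap \Delta_{n+2}(X).$$

Since moreover, $\mathrm{ext}(B_\varepsilon(\mu_n)) \subset \partial B_\varepsilon(\mu_n)$, where $\partial B_\varepsilon(\mu_n) := \{\mu \in \mathcal{M}(X) : d_{Pr}(\mu,\mu_n) = \varepsilon\}$ 
is the sphere, we also conclude that 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset \partial B_\varepsilon(\mu_n) \cap \Delta_{n+2}(X).$$

However, these supersets may be difficult to compute, so we look to Theorem 2.3 for sets 
generated by linear programming. To that end, write $\{d > \varepsilon\}$ for the subset of elements 
$(x,y) \in X \times X$ such that $d(x,y) > \varepsilon$, and consider the subset $\Gamma_{\mu_n,\varepsilon} \subset \mathcal{M}(X \times X)$ defined 
in the proof of Theorem 2.1 by 

$$\Gamma_{\mu_n,\varepsilon} := \big\{\nu \in \mathcal{M}(X \times X) : \nu\{d > \varepsilon\} \le \varepsilon,\ P_1\nu = \mu_n\big\}.$$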

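The proof of Theorem 2.1 used Strassen's Theorem to assert in (6.28) that 

$$P_2\Gamma_{\mu_n,\varepsilon} = B_\varepsilon(\mu_n).$$

Then Theorem 2.3 implies 

$$\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Delta_{n+2}(X \times X) \tag{3.4}$$

and the chain of inclusions 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) = \mathrm{ext}(P_2(\Gamma_{\mu_n,\varepsilon})) \subset P_2(\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})) \subset \Delta_{n+2}(X).$$

Consequently, we obtain 

Corollary 3.1. Consider the situation of Theorem 2.1 and the set $\Gamma_{\mu_n,\varepsilon}$ defined in 
Theorem 2.3 by $c := d$ in the Kantorovich case and $c := 1_{d > \varepsilon}$ in the Prokhorov case. 
Then we have 

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset P_2(\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})) \subset B_\varepsilon(\mu_n) \cap \Delta_{n+2}(X)$$

$$\mathrm{ext}(B_\varepsilon(\mu_n)) \subset P_2(\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)) \subset B_\varepsilon(\mu_n) \cap \Delta_{n+2}(X).$$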

The statement of Corollary 3.1 captures the mechanism by which we obtain the 
improvement from $n + 3$ to $n + 2$ Dirac masses in the description of the extreme points in 
Theorem 2.1. Indeed, since the set $\Gamma_{\mu_n,\varepsilon}$ is a set of measures subject to $n + 1$ constraints, 
its extreme points are convex combinations of $n + 2$ Dirac masses on the product space 
$X \times X$. Then the fact that the extreme points of $B_\varepsilon(\mu_n)$ consist of convex combinations 
of $n + 2$ Dirac masses follows from the fact that Corollary 3.1 implies that the projection 
onto the second component of these extreme points covers all the extreme points of 
$B_\varepsilon(\mu_n)$, and the fact that projections of Dirac masses on $X \times X$ are Dirac masses on $X$. 
Corollary 3.1 also says that both 

$$P_2(\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)) \quad \text{and} \quad P_2(\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}))$$

are supersets for $B_\varepsilon(\mu_n)$. Although the latter is smaller in that 

$$P_2(\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})) \subset P_2(\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)),$$

the computation of the former is useful in the computation of the latter, so we consider 
the computation of both. 

3.1 Computing $\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)$ 

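Since, by (3.4), both $\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})$ and $\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)$ are subsets of 
$P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X)$, it will be convenient to compute $P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X)$ first. 
Let us proceed inductively, and assume that $\mu_n \in \Delta_n(X)$ but is not in $\Delta_{n-1}(X)$. Then 
$\mu_n := \sum_{i=1}^n \beta_i \delta_{y_i}$ with $\beta_i > 0$, $y_i \in X$, $i = 1,..,n$, $\sum_{i=1}^n \beta_i = 1$, and $y_i \ne y_j$, $i \ne j$. 
Fixing this $y = (y_i)$ and $\beta = (\beta_i)$, we now define some subsets of $\mathcal{M}(X \times X)$. For 
$x \in X^m$, $n \le m \le n + 2$, denote 

$$\delta_{y,x} := \sum_{k=1}^n \beta_k \delta_{y_k,x_k}$$

and let 

$$\Pi_0 := \{\delta_{y,x},\ x \in X^n\}. \tag{3.5}$$

For $i = 1,..,n$ and $x \in X^{n+1}$, define 

$$\Pi_i(x) := \big\{\delta_{y,x} + \gamma(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}),\ 0 \le \gamma \le \beta_i\big\} \tag{3.6}$$

and 

$$\Pi_i := \{\Pi_i(x),\ x \in X^{n+1}\}. \tag{3.7}$$

Moreover, for $x \in X^{n+2}$ and for $i < j$, define 

$$\Pi_{i,j}(x) := \big\{\delta_{y,x} + \gamma_i(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}) + \gamma_j(\delta_{y_j,x_{n+2}} - \delta_{y_j,x_j}),\ 0 \le \gamma_i \le \beta_i,\ 0 \le \gamma_j \le \beta_j\big\} \tag{3.8}$$

while for $i = j$, define 

$$\Pi_{i,i}(x) := \big\{\delta_{y,x} + \gamma_1(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}) + \gamma_2(\delta_{y_i,x_{n+2}} - \delta_{y_i,x_i}),\ \gamma_1 \ge 0,\ \gamma_2 \ge 0,\ \gamma_1 + \gamma_2 \le \beta_i\big\} \tag{3.9}$$

and then, for $i \le j$, again take the union 

$$\Pi_{i,j} := \{\Pi_{i,j}(x),\ x \in X^{n+2}\}. \tag{3.10}$$

Lemma 3.2. In terms of the sets defined in (3.5), (3.7), and (3.10), we have 

$$P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X) = \Pi_0 \cup_i \Pi_i \cup_{i \le j} \Pi_{i,j}.$$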
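
To make the sets in (3.5) through (3.10) concrete, the following Python sketch builds an element of these sets as a dictionary of atoms on the product space and checks the two properties asserted by Lemma 3.2: the first marginal is the reference measure and the support has at most n + 2 points. The names `pi_element` and `first_marginal`, the 0-based indexing, and the encoding of a split as a triple (i, γ, new point) are hypothetical choices for this sketch; the new points play the role of x_{n+1} and x_{n+2}.

```python
from collections import defaultdict


def pi_element(beta, y, x, splits=()):
    """Atoms of delta_{y,x} plus the mass splits described by `splits`.

    beta, y: weights and support points of the reference measure.
    x: second components x_1..x_n attached to y_1..y_n.
    splits: iterable of (i, gamma, x_new); moves mass gamma from the atom
            (y[i], x[i]) to the atom (y[i], x_new). Indices are 0-based.
    """
    nu = defaultdict(float)
    for weight, y_k, x_k in zip(beta, y, x):
        nu[(y_k, x_k)] += weight
    for i, gamma, x_new in splits:
        nu[(y[i], x[i])] -= gamma
        nu[(y[i], x_new)] += gamma
    # Drop atoms whose mass has been moved away entirely.
    return {atom: w for atom, w in nu.items() if w > 1e-12}


def first_marginal(nu):
    """Push a measure on the product space forward to its first component."""
    marginal = defaultdict(float)
    for (first, _second), weight in nu.items():
        marginal[first] += weight
    return dict(marginal)


beta, y = [0.5, 0.5], [0.0, 1.0]
nu = pi_element(beta, y, x=[0.1, 1.2], splits=[(0, 0.2, 5.0), (1, 0.5, 7.0)])
print(sorted(nu.items()))
print(first_marginal(nu))
```

In this example the second split uses the full mass of its atom, so the resulting measure has three atoms rather than four, which is consistent with the sets being nested.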

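Using Lemma 3.2, we can now obtain an almost explicit representation of 
$\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)$, almost in the sense that it will amount to an explicitly represented set 
subject to the constraint of a single explicitly computable function. To that end, let us 
combine the definitions (3.5), (3.7), and (3.10) of $\Pi_0$, $\Pi_i$ and $\Pi_{i,j}$ into one symbol with 
the introduction of a multiindex $\mathbf{i}$ that can take the values $\mathbf{i} = 0$, $\mathbf{i} = i$ for $i \in \{1,..,n\}$, or 
$\mathbf{i} = (i,j)$ with $i \le j$. Then, in this notation $\Pi_{\mathbf{i}}(x)$ will denote $\Pi_0(x)$ and imply $x \in X^n$ 
when $\mathbf{i} = 0$, it will denote $\Pi_i(x)$ and imply $x \in X^{n+1}$ when $\mathbf{i} = i$, and denote $\Pi_{i,j}(x)$ 
and imply $x \in X^{n+2}$ when $\mathbf{i} = (i,j)$. 

Since, in general, for $\nu := \sum_{k=1}^m \alpha_k \delta_{x_k,x'_k}$, we have 

$$\nu\{d > \varepsilon\} = \sum_{k=1}^m \alpha_k 1_{d(x_k,x'_k) > \varepsilon}, \tag{3.11}$$

it follows that the function $\nu \mapsto \nu\{d > \varepsilon\}$ restricted to $\Delta_{n+2}(X \times X)$ is explicitly 
computable. Then, since 

$$\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X) = P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X) \cap \{\nu \in \mathcal{M}(X \times X) : \nu\{d > \varepsilon\} \le \varepsilon\}, \tag{3.12}$$

if we incorporate the constraint $\nu\{d > \varepsilon\} \le \varepsilon$ by defining 

$$\hat\Pi_{\mathbf{i}}(x) := \Pi_{\mathbf{i}}(x) \cap \{\nu \in \mathcal{M}(X \times X) : \nu\{d > \varepsilon\} \le \varepsilon\}, \tag{3.13}$$

along with their unions $\hat\Pi_{\mathbf{i}}$ over $X^n$, $X^{n+1}$ and $X^{n+2}$ respectively, then from the 
distributive law of set theory, Lemma 3.2 and (3.12), we conclude that 

$$\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X) = \hat\Pi_0 \cup_i \hat\Pi_i \cup_{i \le j} \hat\Pi_{i,j}. \tag{3.14}$$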
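
The evaluation (3.11) is straightforward to implement for a finitely supported measure on the product space. The Python sketch below computes the mass that such a measure assigns to the set where the distance exceeds ε and tests the constraint used in (3.12) and (3.13). The function names `mass_beyond` and `satisfies_constraint` and the representation of a measure as a dictionary from pairs of points to masses are hypothetical.

```python
def mass_beyond(nu, dist, eps):
    """Evaluate nu{d > eps} for a finitely supported nu, as in (3.11).

    nu: dict mapping atoms (x_k, x'_k) of the product space to weights.
    """
    return sum(w for (a, b), w in nu.items() if dist(a, b) > eps)


def satisfies_constraint(nu, dist, eps):
    """Check the single scalar constraint nu{d > eps} <= eps."""
    return mass_beyond(nu, dist, eps) <= eps


abs_dist = lambda a, b: abs(a - b)
nu = {(0.0, 0.1): 0.3, (0.0, 5.0): 0.2, (1.0, 7.0): 0.5}
print(mass_beyond(nu, abs_dist, 0.5))
print(satisfies_constraint(nu, abs_dist, 0.5))
```

Only the indicator values of the atoms enter the computation, which is the observation exploited by the adjacency matrix of Section 3.3.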
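3.2 Computing $\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})$ 

To compute $\mathrm{ext}(\Gamma_{\mu_n,\varepsilon})$ we use a stronger version of the characterization of the extreme 
points found in Winkler [19, Thm. 2.1] than we used in Theorem 2.3, along with the 
computation of $\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)$ from Lemma 3.2. To that end, consider the 
constraint functions $f_i := 1_{y_i \times X}$, $i = 1,..,n$ (where $1_{y_i \times X}(a,b) = 1$ if $a = y_i$ and 
$1_{y_i \times X}(a,b) = 0$ if $a \ne y_i$) and $f_{n+1} := 1_{d > \varepsilon}$. Then Winkler's [19, Thm. 2.1] assertion 

$$\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Big\{\nu \in \Gamma_{\mu_n,\varepsilon} : \nu = \sum_{i=1}^m \alpha_i \delta_{x_i,x'_i},\ 1 \le m \le n + 2,\ \alpha_i > 0,\ x_i, x'_i \in X,\ i = 1,..,m,$$
$$\text{the vectors } \big(f_1(x_i,x'_i), ..., f_{n+1}(x_i,x'_i), 1\big),\ i = 1,..,m \text{ are linearly independent}\Big\}$$

amounts to 

$$\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Big\{\nu \in \Gamma_{\mu_n,\varepsilon} : \nu = \sum_{i=1}^m \alpha_i \delta_{x_i,x'_i},\ 1 \le m \le n + 2,\ \alpha_i > 0,\ x_i, x'_i \in X,\ i = 1,..,m,$$
$$\text{the vectors } \big(1_{y_1}(x_i), ..., 1_{y_n}(x_i), 1_{d(x_i,x'_i) > \varepsilon}, 1\big),\ i = 1,..,m \text{ are linearly independent}\Big\}. \tag{3.15}$$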
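
The linear independence condition in (3.15) can be tested numerically for a given list of atoms. The Python sketch below forms, for each atom, the vector of indicator values of the reference points, the indicator that the distance exceeds ε, and a final entry 1, and then computes the rank by exact Gaussian elimination. The helper names `constraint_vector`, `rank` and `atoms_independent` are hypothetical, and exact rational arithmetic via the standard library is our own choice for the sketch.

```python
from fractions import Fraction


def constraint_vector(atom, y, dist, eps):
    """Vector (1_{y_1}(a), ..., 1_{y_n}(a), 1_{d(a,b) > eps}, 1) for atom (a, b)."""
    a, b = atom
    return [int(a == y_k) for y_k in y] + [int(dist(a, b) > eps), 1]


def rank(rows):
    """Rank of a small integer matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((k for k in range(r, len(m)) if m[k][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for k in range(r + 1, len(m)):
            factor = m[k][c] / m[r][c]
            m[k] = [vk - factor * vr for vk, vr in zip(m[k], m[r])]
        r += 1
    return r


def atoms_independent(atoms, y, dist, eps):
    """True if the constraint vectors of the atoms are linearly independent."""
    rows = [constraint_vector(atom, y, dist, eps) for atom in atoms]
    return rank(rows) == len(rows)


abs_dist = lambda a, b: abs(a - b)
y = [0.0, 1.0]
print(atoms_independent([(0.0, 0.1), (0.0, 5.0), (1.0, 7.0)], y, abs_dist, 0.5))
print(atoms_independent([(0.0, 0.1), (0.0, 0.2)], y, abs_dist, 0.5))
```

In the second call both atoms sit over the same reference point and on the same side of the threshold, so their vectors coincide and the test fails.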


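Since Theorem 2.3 asserts that $\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Delta_{n+2}(X \times X)$, it follows that we can 
replace $\Gamma_{\mu_n,\varepsilon}$ by $\Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X)$ in the righthand side of (3.15). Having done so, 
let us define 

$$\Theta := \Big\{\nu \in \Gamma_{\mu_n,\varepsilon} \cap \Delta_{n+2}(X \times X) : \nu = \sum_{i=1}^m \alpha_i \delta_{x_i,x'_i},\ 1 \le m \le n + 2,\ \alpha_i > 0,\ x_i, x'_i \in X,\ i = 1,..,m,$$
$$\text{the vectors } \big(1_{y_1}(x_i), ..., 1_{y_n}(x_i), 1_{d(x_i,x'_i) > \varepsilon}, 1\big),\ i = 1,..,m \text{ are linearly independent}\Big\} \tag{3.16}$$

to be the righthand side of (3.15). Then we have 

$$\mathrm{ext}(\Gamma_{\mu_n,\varepsilon}) \subset \Theta \subset \Gamma_{\mu_n,\varepsilon}$$

and therefore $\Theta$ is a superset for $\Gamma_{\mu_n,\varepsilon}$. To compute it, for $i \in \{1,..,n\}$, let us define 

$$A_i := \{x \in X^{n+1} : 1_{d(y_i,x_{n+1}) > \varepsilon} \ne 1_{d(y_i,x_i) > \varepsilon}\} \tag{3.17}$$

and for $i \le j$ define 

$$A_{i,j} := \{x \in X^{n+2} : 1_{d(y_i,x_{n+1}) > \varepsilon} \ne 1_{d(y_i,x_i) > \varepsilon},\ 1_{d(y_j,x_{n+2}) > \varepsilon} \ne 1_{d(y_j,x_j) > \varepsilon}\}. \tag{3.18}$$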

Lemma 3.3. With $A_i$ defined in (3.17), $A_{i,j}$ defined in (3.18), and $\hat\Pi_0$, $\hat\Pi_i$ and $\hat\Pi_{i,j}$ 
defined in (3.13), we have 

$$\Theta = \hat\Pi_0 \cup_i (\hat\Pi_i \cap A_i) \cup_{i \le j} (\hat\Pi_{i,j} \cap A_{i,j}).$$

Remark 3.4. For a reference measure $\mu := \sum_{k=1}^n \beta_k \delta_{y_k}$, it is interesting to note that 
the condition that a measure 

$$\big\{\delta_{y,x} + \gamma(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}),\ 0 \le \gamma \le \beta_i\big\}$$

is a member of $\Pi_i \cap A_i$ amounts to the splitting off of the mass $\beta_i$ on the Dirac situated at 
$y_i$ into the convex sum of two Dirac masses, one situated at $(y_i, x_i)$ and one at $(y_i, x_{n+1})$, 
such that, between $x_i$ and $x_{n+1}$, one is inside the ball of radius $\varepsilon$ about $y_i$ and the other 
is outside it. Moreover, to be a member of $\Pi_{i,j}$ with $i \le j$ amounts to two such splits. 
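
The straddling condition described in Remark 3.4, that is, membership in the set defined in (3.17), can be tested directly. The Python sketch below is a minimal illustration; the function name `in_A_i` and its argument order are hypothetical.

```python
def in_A_i(y_i, x_i, x_new, dist, eps):
    """Condition (3.17): exactly one of x_i, x_new lies farther than eps from y_i.

    y_i: reference point; x_i: original second component; x_new: the point
    playing the role of x_{n+1}.
    """
    return (dist(y_i, x_new) > eps) != (dist(y_i, x_i) > eps)


abs_dist = lambda a, b: abs(a - b)
print(in_A_i(0.0, 0.1, 5.0, abs_dist, 0.5))  # one point inside, one outside
print(in_A_i(0.0, 0.1, 0.2, abs_dist, 0.5))  # both points inside
```

A split that fails this test produces two atoms with identical constraint vectors, so it cannot contribute to the superset of Lemma 3.3 beyond what the unsplit measure already provides.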

3.3 Equivalence classes determined by the adjacency matrix 

For $x \in X^m$, $n \le m \le n + 2$, let its adjacency matrix $A(x)$ be defined by 

$$A_{i,j}(x) := 1_{d(y_i,x_j) > \varepsilon}, \quad i = 1,..,n,\ j = 1,..,m.$$

Commensurate with our introduction of the multiindex $\mathbf{i}$, we use the expression $A(x)$ to 
mean the $n \times m$ adjacency matrix when $x \in X^m$, for any $m = n, n + 1, n + 2$. Since, 
by Lemma 3.3, $\Theta = \hat\Pi_0 \cup_i (\hat\Pi_i \cap A_i) \cup_{i \le j} (\hat\Pi_{i,j} \cap A_{i,j})$ and the latter are determined by 
conditions $A_i$, $i = 1,..,n$, $A_{i,j}$ for $i \le j$, and $\nu\{(z,z') \in X \times X : d(z,z') > \varepsilon\} \le \varepsilon$, all of which, 
by the evaluation (3.11), only depend on the values of the adjacency matrix, we 
obtain the following lemma. It asserts that, for any point in $\hat\Pi_0$, $\hat\Pi_i$ or $\hat\Pi_{i,j}$, if the 
second components $x$ of the Dirac masses are changed to $x'$ with the same adjacency 
matrix, then the resulting sum of Dirac masses remains in $\hat\Pi_0$, $\hat\Pi_i$ or $\hat\Pi_{i,j}$ respectively. 
Consequently, it will be useful in the efficient exploration of the set $\Theta$. 

Lemma 3.5. For $n \le m \le n + 2$, $x \in X^m$, $z \in X^m$ and $\alpha \in \mathbb{R}^m$, consider 
$\nu(x) := \sum_{k=1}^m \alpha_k \delta_{z_k,x_k}$. If $\nu(x) \in \hat\Pi_{\mathbf{i}}(x)$, then for all $x'$ such that $A(x') = A(x)$, we have 
$\nu(x') \in \hat\Pi_{\mathbf{i}}(x')$. 
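
The adjacency matrix and the invariance expressed by Lemma 3.5 can be illustrated as follows. The Python sketch computes the matrix of indicators and shows, for the unsplit measure with atoms at (y_k, x_k), that the constraint value from (3.11) is read off the diagonal of the matrix, so two choices of second components with the same matrix give the same value. The function names `adjacency` and `constraint_value_unsplit` are hypothetical, and only the unsplit case is covered.

```python
def adjacency(y, x, dist, eps):
    """n-by-m matrix with entries 1_{d(y_i, x_j) > eps}."""
    return [[int(dist(y_i, x_j) > eps) for x_j in x] for y_i in y]


def constraint_value_unsplit(beta, matrix):
    """nu{d > eps} for nu = sum_k beta_k delta_{(y_k, x_k)}, from the diagonal."""
    return sum(b * matrix[k][k] for k, b in enumerate(beta))


abs_dist = lambda a, b: abs(a - b)
beta, y, eps = [0.5, 0.5], [0.0, 1.0], 0.5
x_first, x_second = [0.1, 1.7], [0.3, 2.5]
a_first = adjacency(y, x_first, abs_dist, eps)
a_second = adjacency(y, x_second, abs_dist, eps)
print(a_first == a_second)
print(constraint_value_unsplit(beta, a_first), constraint_value_unsplit(beta, a_second))
```

Since the matrix takes only finitely many values, a search over second components can be organized by equivalence classes of points sharing the same matrix.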


4 Extreme points of a ball about an empirical measure 


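Empirical measures take the form $\mu_n = \frac{1}{n}\sum_{i=1}^n \delta_{y_i}$ with $y_i \in X$, $i = 1,..,n$. When all 
the points $y_i$ are unique, we can define $\beta_i := \frac{1}{n}$, $i = 1,..,n$ in the expressions of Section 
3; when the points have duplicates things will be more complicated. In the unique case, 
the definitions (3.5), (3.6), (3.8) and (3.9) of $\Pi_0$, $\Pi_i(x)$ and $\Pi_{i,j}(x)$ take on a more 
symmetrical form, and since the case when the central measure is an empirical measure 
is an important application, we spell them out. To begin with, we have 

$$\delta_{y,x} = \frac{1}{n}\sum_{k=1}^n \delta_{y_k,x_k}.$$

Moreover, the evaluation of the constraint $\nu\{d > \varepsilon\} \le \varepsilon$ also takes a simpler form, so 
that constrained sets $\hat\Pi_0$, $\hat\Pi_i(x)$ and $\hat\Pi_{i,j}(x)$ appear as follows: 

$$\hat\Pi_0 = \{\delta_{y,x},\ x \in X^n\}$$

subject to the constraint 

$$\frac{1}{n}\sum_{k=1}^n 1_{d(y_k,x_k) > \varepsilon} \le \varepsilon,$$

while for $i \in \{1,..,n\}$ we have 

$$\hat\Pi_i(x) = \Big\{\delta_{y,x} + \frac{\gamma}{n}(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}),\ 0 \le \gamma \le 1\Big\}$$

subject to the constraint 

$$\frac{1}{n}\Big(\sum_{k=1}^n 1_{d(y_k,x_k) > \varepsilon} + \gamma\big(1_{d(y_i,x_{n+1}) > \varepsilon} - 1_{d(y_i,x_i) > \varepsilon}\big)\Big) \le \varepsilon,$$

and for $i < j$ we have 

$$\hat\Pi_{i,j}(x) = \Big\{\delta_{y,x} + \frac{\gamma_i}{n}(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}) + \frac{\gamma_j}{n}(\delta_{y_j,x_{n+2}} - \delta_{y_j,x_j}),\ 0 \le \gamma_i \le 1,\ 0 \le \gamma_j \le 1\Big\}$$

subject to the constraint 

$$\frac{1}{n}\Big(\sum_{k=1}^n 1_{d(y_k,x_k) > \varepsilon} + \gamma_i\big(1_{d(y_i,x_{n+1}) > \varepsilon} - 1_{d(y_i,x_i) > \varepsilon}\big) + \gamma_j\big(1_{d(y_j,x_{n+2}) > \varepsilon} - 1_{d(y_j,x_j) > \varepsilon}\big)\Big) \le \varepsilon,$$

and for $i = j$ we have 

$$\hat\Pi_{i,i}(x) = \Big\{\delta_{y,x} + \frac{\gamma_1}{n}(\delta_{y_i,x_{n+1}} - \delta_{y_i,x_i}) + \frac{\gamma_2}{n}(\delta_{y_i,x_{n+2}} - \delta_{y_i,x_i}),\ \gamma_1 \ge 0,\ \gamma_2 \ge 0,\ \gamma_1 + \gamma_2 \le 1\Big\}$$

subject to the constraint 

$$\frac{1}{n}\Big(\sum_{k=1}^n 1_{d(y_k,x_k) > \varepsilon} + \gamma_1\big(1_{d(y_i,x_{n+1}) > \varepsilon} - 1_{d(y_i,x_i) > \varepsilon}\big) + \gamma_2\big(1_{d(y_i,x_{n+2}) > \varepsilon} - 1_{d(y_i,x_i) > \varepsilon}\big)\Big) \le \varepsilon.$$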
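
For an empirical reference measure with distinct points, the constraints of this section reduce to counting indicators. The Python sketch below evaluates the left-hand side of the constraint for the unsplit case and for one or two splits. It assumes that each split parameter γ is scaled to lie in [0, 1], so that the moved mass is γ/n; this reading follows from applying (3.11) with equal weights 1/n. The function name `empirical_constraint_value` and the encoding of a split as a triple (i, γ, new point) with 0-based index are hypothetical.

```python
def empirical_constraint_value(y, x, dist, eps, splits=()):
    """Left-hand side of the constraint nu{d > eps} <= eps for equal weights 1/n.

    y: distinct reference points; x: second components x_1..x_n.
    splits: iterable of (i, gamma, x_new) with gamma in [0, 1] (assumed
            scaling); moves mass gamma/n from (y[i], x[i]) to (y[i], x_new).
    """
    n = len(y)
    beyond = lambda a, b: int(dist(a, b) > eps)
    total = sum(beyond(y_k, x_k) for y_k, x_k in zip(y, x))
    for i, gamma, x_new in splits:
        total += gamma * (beyond(y[i], x_new) - beyond(y[i], x[i]))
    return total / n


abs_dist = lambda a, b: abs(a - b)
y, x, eps = [0.0, 1.0, 2.0], [0.1, 1.1, 5.0], 0.4
print(empirical_constraint_value(y, x, abs_dist, eps))
print(empirical_constraint_value(y, x, abs_dist, eps, splits=[(0, 0.5, 3.0)]))
print(empirical_constraint_value(y, x, abs_dist, eps, splits=[(0, 0.1, 3.0)]))
```

In this example the unsplit measure and the small split satisfy the constraint for ε = 0.4, while the larger split moves too much mass outside the ε-neighbourhood of its reference point.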


5 Appendix 

5.1 Extreme subsets 

We begin by establishing a fundamental identity regarding the extreme subsets of 
extreme subsets¹ of an affine space. Since this terminology varies in the literature, we fix 
it now. Following [2, Def. 7.61], we say that a set $E$ is an extreme subset of a subset 
$A \subset L$ of a real linear space $L$ if $E \subset A$ and $\theta x + (1 - \theta)y \in E$ with $x,y \in A$, $\theta \in (0,1)$, 
implies that $x,y \in E$. Note that this definition does not require convexity. An extreme 
point of $A$ is an extreme subset of $A$ consisting of a single point. We say that a set $F$ 
is a face of a subset $A \subset L$ of a real linear space $L$ if it is a convex extreme subset of 
$A$. The following lemma implies that Simon [14, Prop. 8.6] is valid without assuming 
compactness or convexity. 

¹ The repetition here is not a typo. 

Lemma 5.1. Let $A$ be a subset of a real linear space $L$ and let $E$ be an extreme subset 
of $A$. Then $B$ is an extreme subset of $E$ if and only if $B \subset E$ and it is an extreme subset 
of $A$. In particular, 

$$\mathrm{ext}(E) = E \cap \mathrm{ext}(A).$$

Proof. The proof is identical to that of [14, Prop. 8.6], but we reproduce it here so that 
the reader can confirm that it is valid without compactness or convexity assumptions. 
First suppose that $B \subset E$ and $B$ is an extreme subset of $A$. Then, by definition, if 
$\theta x + (1 - \theta)y \in B$, with $x,y \in A$, $\theta \in (0,1)$, then $x,y \in B$. Since $E \subset A$, it follows that 
if we have $\theta x + (1 - \theta)y \in B$, with $x,y \in E$, $\theta \in (0,1)$, then $x,y \in B$. Consequently, since 
$B \subset E$, $B$ is an extreme subset of $E$. Now assume that $B$ is an extreme subset of $E$. 
Then, if we have $\theta x + (1 - \theta)y \in B$, with $x,y \in A$, $\theta \in (0,1)$, the fact that $B \subset E$ and 
$E$ is an extreme subset of $A$ implies that $x,y \in E$. Then, since $B$ is an extreme subset 
of $E$, it follows that $x,y \in B$. Since clearly $B \subset A$, we conclude that $B$ is an extreme 
subset of $A$. □ 

5.2 Affine images of extreme points 

Here we establish a fundamental result for affine transformations and extreme points of, 
possibly non-convex, subsets. 

Lemma 5.2. Let $L$ and $L'$ be real linear spaces and $K \subset L$ a subset. Suppose that 
$G : K \to L'$ is the restriction of an affine transformation $G : L \to L'$ to $K$ such that 
$\mathrm{ext}(G^{-1}(k')) \ne \emptyset$ for all $k' \in \mathrm{ext}(G(K))$. Then $G(\mathrm{ext}(K)) \supset \mathrm{ext}(G(K))$. 

Proof. Let $k' \in \mathrm{ext}(G(K))$ and consider any point $k \in G^{-1}(k')$. Then if 
$k = \theta k_1 + (1 - \theta)k_2$, with $k_1,k_2 \in K$, $\theta \in (0,1)$, then 
$k' = G(k) = G(\theta k_1 + (1 - \theta)k_2) = \theta G(k_1) + (1 - \theta)G(k_2)$, 
so that, since $k'$ is an extreme point, it follows that $G(k_1) = G(k_2) = G(k)$. 
That is, $G^{-1}(k')$ is an extreme subset of $K$. Therefore, Lemma 5.1 implies that 

$$\mathrm{ext}(G^{-1}(k')) = G^{-1}(k') \cap \mathrm{ext}(K),$$

so that any extreme point of $G^{-1}(k')$ is an extreme point of $K$. Since, by assumption, 
$G^{-1}(k')$ has an extreme point, it follows that any such extreme point is an extreme point 
of $K$. Since the image under $G$ of any such point is $k'$, and $k' \in \mathrm{ext}(G(K))$ was arbitrary, 
the assertion follows. □ 

5.3 Integrals of extended real-valued lower semicontinuous functions 

Here we formulate a generalization to extended real-valued functions of [2, Thm. 15.5], 
that the integral of a bounded lower semicontinuous function forms a lower semicontinuous 
function in the weak topology. 

Lemma 5.3. Let $(X,d)$ be a metric space and $f : X \to \bar{\mathbb{R}}$ a nonnegative lower 
semicontinuous extended real-valued function. For $\mu \in \mathcal{M}(X)$ define $\int f d\mu$ to be the integral 
if $f$ is $\mu$-integrable and $\infty$ if it is not. Then the function $F : \mathcal{M}(X) \to \bar{\mathbb{R}}$ defined by 
$F(\mu) := \int f d\mu$ is lower semicontinuous in the weak topology. 

Proof. We follow Aliprantis and Border [2, Thm. 15.5]. First let us clip the function 
$f$ at the level $s$ by $f^s(x) := \min(f(x), s)$, $x \in X$. Then since for all $c$ we have 
$\{x : f^s(x) \le c\} = \{x : f(x) \le c\}$ for $s > c$ and $\{x : f^s(x) \le c\} = X$ for 
$s \le c$, it follows that $f^s$ is a real-valued lower semicontinuous function. Consequently, by [2, 
Thm. 3.13] for each $s$, $f^s$ is the increasing pointwise limit of a sequence $f^s_i$ of Lipschitz 
continuous functions. By further clipping from below at 0, sending $f^s_i \mapsto \max(f^s_i, 0)$, 
we obtain that we can assume that for each $s$, $f^s$ is the increasing pointwise limit of a 
sequence of nonnegative bounded continuous functions. Therefore, setting $s := n$ and 
defining $f_n := f^n_n$, we conclude that $f$ is the increasing pointwise limit of a sequence $f_n$ 
of bounded continuous nonnegative real-valued functions. 

Now let $\mu_\alpha$ be a net such that $\mu_\alpha \to \mu$ in the weak topology and let us utilize the 
integration theory for extended real-valued functions as found in Ash [3, Sec. 1]. Then 
it follows that 

$$\int f_n \, d\mu_\alpha \to \int f_n \, d\mu \tag{5.19}$$

and 

$$\int f_n \, d\mu_\alpha \le \int f \, d\mu_\alpha \tag{5.20}$$

so that we conclude that 

$$\int f_n \, d\mu \le \liminf_\alpha \int f \, d\mu_\alpha$$

for each $n$. Therefore, from the monotone convergence theorem for extended valued 
functions, see e.g. Ash [3, 1.6.2], we have 

$$\int f \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu$$

and we conclude that 

$$\int f \, d\mu \le \liminf_\alpha \int f \, d\mu_\alpha$$

so that the assertion follows from the alternative characterization of lower semicontinuous 
extended real-valued functions [2, Lem. 2.42]. □ 


6 Proofs 

6.1 Proof of Theorem 2.2 

We follow the proof of the main result in [18], simplifying it according to our more 
modest goal. Let $t$ denote the topology of $X$. Since $X$ is a Borel subset of a Polish 
space, it follows that it is Suslin and therefore all finite Borel measures on $(X,t)$ are 
tight. Let $C \subset \mathcal{M}(X)$ be a nontrivial closed convex subset and consider $\mu^* \in C$. Since 
$\mu^*$ is tight, using a recursive argument, we obtain a sequence $K_n \subset X$, $n \in \mathbb{N}$, of disjoint 
compact subsets such that if we define $X_1 := \cup_{n \in \mathbb{N}} K_n$ we have $\mu^*(X_1) = 1$. Let the 
relative topology of the subspace $X_1 \subset X$ be denoted by $t_0$ and introduce a finer topology 
$t_1 \supset t_0$ defined by $A \in t_1$ if, for every $n \in \mathbb{N}$, we have $A \cap K_n = B_n \cap K_n$ for some $B_n \in t$. 
It follows that $K_n \in t_1$ for all $n \in \mathbb{N}$, so that $(X_1,t_1)$ is locally compact. Moreover, since 
$(X_1,t_0)$ is metric, it is Hausdorff, and since $t_1$ is finer than $t_0$ it follows that $(X_1,t_1)$ is 
Hausdorff. Let us show that $(X_1,t_1)$ is also completely regular. To that end, recall, see 
e.g. Willard [17, Thm. 14.12], that a space is completely regular if and only if its topology 
is the initial topology corresponding to the bounded continuous functions. Since $(X_1,t_0)$ 
is metric it is completely regular. Consequently the topology $t_1$ amounts to the initial 
topology corresponding to the addition of the set of indicator functions $1_{K_n}$, $n \in \mathbb{N}$, to 
the collection of continuous functions on $(X_1,t_0)$. Therefore, $(X_1,t_1)$ is also completely 
regular. Since $(X,t)$ is Suslin it is second countable and therefore $(X_1,t_0)$ is second 
countable. Since a base for the topology $t_1$ can be constructed by taking a base for 
$(X_1,t_0)$ and taking all intersections with the sets $K_n$, $n \in \mathbb{N}$, it follows that $(X_1,t_1)$ is 
second countable. Consequently, all the spaces $(X,t)$, $(X_1,t_0)$ and $(X_1,t_1)$ are second 
countable. 

Now observe that for $A \in t_1$ we have $A = \cup_{n \in \mathbb{N}} A \cap K_n$ and for each $n$, we have 
$A \cap K_n = B_n \cap K_n$ for some $B_n \in t$. Since both $B_n$ and $K_n$ are in $\mathcal{B}(t)$ it follows that 
the intersection is also and therefore also the countable union $A = \cup_{n \in \mathbb{N}} A \cap K_n$. That 
is, $A \in \mathcal{B}(t)$ and since $A \subset X_1$ it follows that $A \in \mathcal{B}(t_0)$. Since $t_1$ is finer than $t_0$, we 
conclude that 

$$\mathcal{B}(t_0) = \mathcal{B}(t_1)$$

and therefore 

$$\mathcal{M}(X_1,t_0) = \mathcal{M}(X_1,t_1) \tag{6.21}$$

as sets. 

Since $(X_1,t_1)$ is locally compact and Hausdorff, we consider the Alexandroff one-point 
compactification $(X_2,t_2)$ of $(X_1,t_1)$. Since $(X_1,t_1)$ is second countable, it follows, 
see e.g. [2, Thm. 3.44], that the compactification $(X_2,t_2)$ is metrizable. Consequently, 
$(X_2,t_2)$ is a compact metrizable Hausdorff space, and so it follows, see e.g. [2, 
Thm. 15.11], that $\mathcal{M}(X_2,t_2)$ is compact and metrizable. Moreover, since by e.g. [2, 
Lem. 3.26 & Thm. 3.28], all compact metrizable spaces are separable and therefore 
second countable, it follows that $\mathcal{M}(X_2,t_2)$ is second countable. 

Define 

$$\mathcal{M}_{X_1}(X,t) := \{\mu \in \mathcal{M}(X,t) : \mu(X_1) = 1\}$$

$$\mathcal{M}_{X_1}(X_2,t_2) := \{\mu \in \mathcal{M}(X_2,t_2) : \mu(X_1) = 1\}$$

where $X_1 \subset X_2$ is the subset identification corresponding to the compactification. 
Since both $\mathcal{M}(X,t)$ and $\mathcal{M}(X_2,t_2)$ are second countable, it follows that the subspaces 
$\mathcal{M}_{X_1}(X,t)$ and $\mathcal{M}_{X_1}(X_2,t_2)$ are second countable. Since $(X_2,t_2)$ is compact and 
Hausdorff it follows from [17, Thm. 17.10 & Cor. 15.7] that $(X_2,t_2)$ is completely regular. 
Consequently, if we let 

$$i_0 : (X_1,t_0) \to (X,t)$$

$$i_1 : (X_1,t_1) \to (X_2,t_2)$$

denote the two subset injections, then since both $(X_1,t_0)$ and $(X_2,t_2)$ are completely 
regular, Bourbaki [4, Prop. 8, Sec. 5.3] implies that the pushforward maps 

$$(i_0)_* : \mathcal{M}(X_1,t_0) \to \mathcal{M}_{X_1}(X,t),$$
$$(i_1)_* : \mathcal{M}(X_1,t_1) \to \mathcal{M}_{X_1}(X_2,t_2),$$

are homeomorphisms. Because of the identity (6.21) it is natural to define 

$$\iota : \mathcal{M}_{X_1}(X,t) \to \mathcal{M}_{X_1}(X_2,t_2)$$

by 

$$\iota := (i_1)_* \circ (i_0)_*^{-1}.$$

Although each component $(i_0)_*$ and $(i_1)_*$ of $\iota$ is a homeomorphism, since we have 
$\mathcal{M}(X_1,t_0) = \mathcal{M}(X_1,t_1)$ only as sets, $\iota$ may not be a homeomorphism. However, since $t_1$ is 
finer than $t_0$ it follows that the identity map $\mathcal{I} : \mathcal{M}(X_1,t_1) \to \mathcal{M}(X_1,t_0)$ is continuous, 
and if we more properly write 

$$\iota = (i_1)_* \circ \mathcal{I}^{-1} \circ (i_0)_*^{-1}$$

as a composition of three maps on topological spaces, it follows from the continuity of $\mathcal{I}$ 
and the fact that $(i_0)_*$ and $(i_1)_*$ are homeomorphisms, that 

$$\iota \text{ is a closed map}. \tag{6.22}$$


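Now define 

$$C_0 := C \cap \mathcal{M}_{X_1}(X,t)$$

$$C_2 := \iota C_0$$

and 

$$\bar{C}_2 := \text{the closure of } C_2 \text{ in } \mathcal{M}(X_2,t_2).$$

Since $\iota$ is affine it follows that $C_2$ is convex. Moreover, since $C_0$ is relatively closed 
in $\mathcal{M}_{X_1}(X,t)$ and by (6.22) $\iota$ is a closed map, it follows that $C_2 = \iota C_0$ is relatively 
closed in $\mathcal{M}_{X_1}(X_2,t_2)$. Consequently, there exists a closed set $\hat{C}_2 \subset \mathcal{M}(X_2,t_2)$ such 
that $C_2 = \hat{C}_2 \cap \mathcal{M}_{X_1}(X_2,t_2)$. Since it follows that $\hat{C}_2 \supset C_2$ we obtain 

$$C_2 \subset \bar{C}_2 \subset \hat{C}_2$$

and therefore 

$$C_2 = C_2 \cap \mathcal{M}_{X_1}(X_2,t_2) \subset \bar{C}_2 \cap \mathcal{M}_{X_1}(X_2,t_2) \subset \hat{C}_2 \cap \mathcal{M}_{X_1}(X_2,t_2) = C_2$$

so that we conclude that 

$$C_2 = \bar{C}_2 \cap \mathcal{M}_{X_1}(X_2,t_2). \tag{6.23}$$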
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It is easy to show that both A4xi{X,t) C A4{X,t) and A4xi{X2,t2) C A4{X2,t2) 
are extreme subsets. Therefore, it follows from Lemma 5.2 that 

ext(C'o) = ext(C') n M.Xi {X, t) (6.24) 

and 

ext(C 2 ) = ext(C' 2 ) r\ Mxi(X2,t2) ■ (6.25) 

Since t is a composition of affine bijections, it is an affine bijection, so that we have 

ext(C 2 ) = text (Co). 

Finally, observe that $\mu^*$, selected at the beginning of the proof, satisfies
$\mu^* \in \mathcal{M}_{X_1}(X, t)$. Therefore it follows that $C_0$, and therefore $C_2 := \iota C_0$ and $\bar{C}_2$, are not empty.
Consequently, since $\bar{C}_2 \subset \mathcal{M}(X_2, t_2)$ is closed and $\mathcal{M}(X_2, t_2)$ is compact, it follows that
$\bar{C}_2$ is compact, and since $\mathcal{M}(X_2, t_2)$ is locally convex and metrizable, it follows from
Choquet's Theorem for metrizable compact convex sets, see Alfsen [1, Cor. 1.4.9], that
each element $\mu \in \bar{C}_2$ has an integral representation over the boundary $\operatorname{ext}(\bar{C}_2)$. That
is, $\operatorname{ext}(\bar{C}_2) \neq \emptyset$ is measurable, and for $\mu \in \bar{C}_2$ there exists a probability measure $p$ on
$\operatorname{ext}(\bar{C}_2)$ such that for all continuous functions $f$ on $X_2$, we have

$$\mu(f) = \int_{\operatorname{ext}(\bar{C}_2)} \nu(f)\, dp(\nu),$$

where $\mu(f)$ and $\nu(f)$ denote the integrals $\int f\, d\mu$ and $\int f\, d\nu$.

Consider the open subset $X_1 \subset X_2$. Since $X_2$ is a metric space, it follows, see
e.g. [2, Cor. 3.14], that the indicator function $1_{X_1}$ is the increasing pointwise limit of
a sequence of continuous functions $f_n, n \in \mathbb{N}$, with values in $[0,1]$. Since $\bar{C}_2$ is a subset
of a metrizable second countable space, it too is metrizable and second countable, and
therefore it follows from [2, Lem. 3.4] that it is separable. Consequently, [2, Thm. 15.13]
implies that the function $\nu \mapsto \nu(f)$ is measurable for all bounded measurable functions
$f$. Therefore, by the monotone convergence theorem [3, Thm. 1.6.2] applied three times:
to the left-hand side, to the integrand of the right-hand side, and to the integral on the
right-hand side, we conclude that

$$\mu(X_1) = \int_{\operatorname{ext}(\bar{C}_2)} \nu(X_1)\, dp(\nu). \tag{6.26}$$

Since $C_2 \subset \bar{C}_2$, it follows that each $\mu \in C_2$ has a representing measure $p$ such that the integral
formula (6.26) holds. Since $\mu \in C_2$, the equality $\mu(X_1) = 1$ implies that $\nu(X_1) = 1$
$p$-almost everywhere. In particular, there exists a $\nu \in \operatorname{ext}(\bar{C}_2)$ such that $\nu(X_1) = 1$. That
is, $\operatorname{ext}(\bar{C}_2) \cap \mathcal{M}_{X_1}(X_2, t_2) \neq \emptyset$. Since by (6.25) $\operatorname{ext}(C_2) = \operatorname{ext}(\bar{C}_2) \cap \mathcal{M}_{X_1}(X_2, t_2)$ it
follows that $\operatorname{ext}(C_2) \neq \emptyset$. Furthermore, the relation $\iota \operatorname{ext}(C_0) = \operatorname{ext}(C_2)$ implies that
$\operatorname{ext}(C_0) \neq \emptyset$ and the relation $\operatorname{ext}(C_0) = \operatorname{ext}(C) \cap \mathcal{M}_{X_1}(X, t)$ implies that $\operatorname{ext}(C) \neq \emptyset$,
which is the assertion of the theorem.

6.2 Proof of Theorem 2.3

It is straightforward to show that $X \times X$ is a Borel subset of the Polish metric space
determined by the product of the ambient Polish metric spaces. Therefore, Suslin's
Theorem, see e.g. Kechris [11, Thm. 14.2], implies that both $X$ and $X \times X$ are Suslin,
and therefore by Dellacherie and Meyer [6, III.69], it follows that all probability measures
in both $\mathcal{M}(X)$ and $\mathcal{M}(X \times X)$ are tight. This tightness facilitates both the existence
of extreme points for convex sets of measures, useful in obtaining the assertion, and the
duality theorems of Strassen and Kantorovich-Rubinstein used in the proof of Theorem
2.1.

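Lemma 5.3 implies that $\{\nu \in \mathcal{M}(X \times X) : \int c(x, x')\, d\nu(x, x') \le \epsilon\}$ is closed and
convex in the weak topology. Moreover, by Aliprantis and Border [2, Thm. 15.14] the
marginal maps $P_1$ and $P_2$ are continuous in the weak topologies. Since singletons in
$\mathcal{M}(X)$ are closed, for $\mu \in \mathcal{M}(X)$, it follows that $\{\nu \in \mathcal{M}(X \times X) : P_1\nu = \mu_n\}$ and
$\{\nu \in \mathcal{M}(X \times X) : P_2\nu = \mu\}$ are also closed and convex, and therefore $\Gamma_{\mu_n,\epsilon} \cap P_2^{-1}\mu$
is closed and convex in the weak topology. Since $\Gamma_{\mu_n,\epsilon} \cap P_2^{-1}\mu$ is nonempty whenever
$\mu \in P_2(\Gamma_{\mu_n,\epsilon})$, Winkler's Theorem 2.2 implies that it then possesses an extreme point.
Therefore Lemma 5.2 implies that

$$P_2(\operatorname{ext}(\Gamma_{\mu_n,\epsilon})) \supset \operatorname{ext}(P_2(\Gamma_{\mu_n,\epsilon})),$$

establishing the second assertion.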

For the first, let us describe $\operatorname{ext}(\Gamma_{\mu_n,\epsilon})$. To that end, write $\mu_n = \sum_{i=1}^n \alpha_i \delta_{x_i}$ with
$\alpha_i > 0, x_i \in X, i = 1,..,n$ and $\sum_{i=1}^n \alpha_i = 1$. Then consider the $n + 1$ constraint functions
$c$ and $1_{\{x_i\} \times X}, i = 1,..,n$ to define $\Gamma_{\mu_n,\epsilon}$ as inequality/equality constraints defined by
integrals of measurable functions on $\mathcal{M}(X \times X)$. Then [12, Thm. 4.1, Rmk. 4.2] (derived
from Winkler [19, Thm. 2.1], which is a consequence of Dubins [7]) implies that

$$\operatorname{ext}(\Gamma_{\mu_n,\epsilon}) \subset \Delta_{n+2}(X \times X),$$

establishing the first assertion. The third assertion follows by combining the first two
and $P_2(\Delta_{n+2}(X \times X)) = \Delta_{n+2}(X)$.

6.3 Proof of Theorem 2.1 

Since $X$ is a Borel subset of a Polish metric space, Suslin's Theorem, see e.g. Kechris
[11, Thm. 14.2], implies that $X$ is Suslin, and therefore by Dellacherie and Meyer [6,
III.69], it follows that all probability measures in $\mathcal{M}(X)$ are tight.

Let us begin with the Prokhorov case. We use the Prokhorov metric on $\mathcal{M}(X \times X)$.
Consider the subset $\Gamma_{\mu_n,\epsilon} \subset \mathcal{M}(X \times X)$ defined by

$$\Gamma_{\mu_n,\epsilon} := \{\nu \in \mathcal{M}(X \times X) : \nu\{d > \epsilon\} \le \epsilon,\ P_1\nu = \mu_n\}.$$

For any $\nu \in \Gamma_{\mu_n,\epsilon}$, for $\mu' := P_2\nu$ it follows that $P_1\nu = \mu_n$, $P_2\nu = \mu'$ and $\nu\{d > \epsilon\} \le \epsilon$, so
that by the Prokhorov-Ky Fan inequality [8, Thm. 11.3.5] it follows that $d_{Pr}(\mu', \mu_n) \le \epsilon$,
that is $\mu' \in B_\epsilon(\mu_n)$, so that we conclude that

$$P_2(\Gamma_{\mu_n,\epsilon}) \subset B_\epsilon(\mu_n). \tag{6.27}$$

To obtain the reverse inclusion, let us first note that the inf in the definition (2.1) of
the Prokhorov metric can be replaced by a min. To see this, observe that for fixed $A \in
\mathcal{B}(X)$, the parametrized family of open sets $A^\epsilon, \epsilon > 0$ is increasing. Consequently,
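if $\epsilon_n \downarrow \epsilon$ then for any $\mu \in \mathcal{M}(X)$ we have $\mu(A^{\epsilon_n}) \downarrow \mu(A^{\epsilon})$, so that, for fixed $A \in \mathcal{B}(X)$
and $\mu_1, \mu_2 \in \mathcal{M}(X)$, the interval $\{\epsilon : \mu_1(A) \le \mu_2(A^\epsilon) + \epsilon\}$ is closed. It follows that
the intersection $\{\epsilon : \mu_1(A) \le \mu_2(A^\epsilon) + \epsilon,\ A \in \mathcal{B}(X)\}$ of these closed intervals over all
$A \in \mathcal{B}(X)$ is closed. Therefore the infimum in the definition (2.1) is attained.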

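Now consider $\mu \in B_\epsilon(\mu_n)$ and define $\epsilon^* := d_{Pr}(\mu_n, \mu)$. Then by the previous remark
we have

$$\mu(A) \le \mu_n(A^{\epsilon^*}) + \epsilon^*, \quad A \in \mathcal{B}(X)$$

and the inequality $\epsilon^* \le \epsilon$ implies that

$$\mu(A) \le \mu_n(A^{\epsilon}) + \epsilon, \quad A \in \mathcal{B}(X).$$

Moreover, if we denote $d(x, A) := \inf_{y \in A} d(x, y)$ then it is easy to see that $A^{\epsilon} = \{x \in
X : d(x, A) < \epsilon\}$ and defining $A^{\epsilon]} := \{x \in X : d(x, A) \le \epsilon\}$ we obtain that

$$\mu(A) \le \mu_n(A^{\epsilon]}) + \epsilon, \quad A \in \mathcal{B}(X).$$

Then, since both $\mu$ and $\mu_n$ are tight, Dudley's [8, Thm. 11.6.2] extension of Strassen's
Theorem to tight measures on separable metric spaces implies that there exists a probability
measure $\nu \in \mathcal{M}(X \times X)$ such that $P_1\nu = \mu_n$, $P_2\nu = \mu$ and $\nu\{d > \epsilon\} \le \epsilon$, that is,
there exists a $\nu \in \Gamma_{\mu_n,\epsilon}$ such that $P_2\nu = \mu$, so that we obtain

$$P_2(\Gamma_{\mu_n,\epsilon}) \supset B_\epsilon(\mu_n)$$

and so, by (6.27), conclude that

$$P_2(\Gamma_{\mu_n,\epsilon}) = B_\epsilon(\mu_n). \tag{6.28}$$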

Since the metric $d$ is a continuous function, it follows that the set $\{(x, x') \in X \times X :
d(x, x') > \epsilon\}$ is open and therefore its indicator function is lower semicontinuous.
Therefore, we can apply Theorem 2.3 to obtain

$$\operatorname{ext}(B_\epsilon(\mu_n)) = \operatorname{ext}(P_2(\Gamma_{\mu_n,\epsilon})) \subset \Delta_{n+2}(X),$$

establishing the assertion.
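The two ingredients of the Prokhorov case can be checked on a finite example. The Python sketch below tests membership of a discrete coupling in $\Gamma_{\mu_n,\epsilon}$ and then verifies the Strassen-type inequality $\mu(A) \le \mu_n(A^{\epsilon]}) + \epsilon$ for $\mu := P_2\nu$. The function names, the dictionary representation of measures and the example on the real line are assumptions made for illustration only; they do not appear in the paper. Only subsets $A$ of the support of $\mu$ are enumerated, which suffices because replacing $A$ by its intersection with the support leaves $\mu(A)$ unchanged and can only shrink $A^{\epsilon]}$.

```python
from fractions import Fraction as F
from itertools import combinations


def marginals(nu):
    """Return (P_1 nu, P_2 nu) for a coupling given as {(x, x'): weight}."""
    p1, p2 = {}, {}
    for (x, xp), w in nu.items():
        p1[x] = p1.get(x, 0) + w
        p2[xp] = p2.get(xp, 0) + w
    return p1, p2


def in_gamma(nu, mu_n, d, eps):
    """Check nu{d > eps} <= eps and P_1 nu = mu_n (hypothetical helper)."""
    p1, _ = marginals(nu)
    far_mass = sum(w for (x, xp), w in nu.items() if d(x, xp) > eps)
    return p1 == mu_n and far_mass <= eps


def strassen_inequality_holds(mu, mu_n, d, eps):
    """Check mu(A) <= mu_n(closed eps-neighborhood of A) + eps
    for every nonempty subset A of the support of mu (brute force)."""
    points = list(mu)
    for r in range(1, len(points) + 1):
        for A in combinations(points, r):
            mass_A = sum(mu[a] for a in A)
            mass_nbhd = sum(
                w for y, w in mu_n.items() if min(d(y, a) for a in A) <= eps
            )
            if mass_A > mass_nbhd + eps:
                return False
    return True


# Illustrative data: mu_n has n = 2 support points on the real line.
dist = lambda a, b: abs(a - b)
mu_n = {0: F(1, 2), 1: F(1, 2)}
nu = {(0, 0): F(1, 2), (1, 1): F(1, 4), (1, 3): F(1, 4)}
eps = F(1, 4)

mu = marginals(nu)[1]
print(in_gamma(nu, mu_n, dist, eps), strassen_inequality_holds(mu, mu_n, dist, eps))
```

Exact rational arithmetic is used so that the boundary case $\nu\{d > \epsilon\} = \epsilon$ is not affected by rounding.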

Now let us consider the Kantorovich case. To that end, let $\mathcal{M}_1(X) \subset \mathcal{M}(X)$ denote
those Borel probability measures $\mu$ such that $\int d(x', x)\, d\mu(x) < \infty$ for some $x' \in X$, and
consider the Monge-Wasserstein distance $d_W$ on $\mathcal{M}_1(X)$ defined by

$$d_W(\mu_1, \mu_2) := \inf_{\nu \in M(\mu_1, \mu_2)} \int d(x, x')\, d\nu(x, x').$$

Then the Kantorovich-Rubinstein Theorem [8, Thm. 11.8.2] states that for all $\mu_1, \mu_2 \in
\mathcal{M}_1(X)$ we have

$$d_K(\mu_1, \mu_2) = d_W(\mu_1, \mu_2),$$

and if $\mu_1$ and $\mu_2$ are tight, that there is a measure in $\mathcal{M}(X \times X)$ at which the infimum
in the definition of $d_W$ is attained.

Define $\Gamma_{\mu_n,\epsilon} \subset \mathcal{M}(X \times X)$ by

$$\Gamma_{\mu_n,\epsilon} := \Big\{\nu \in \mathcal{M}(X \times X) : \int d(x, x')\, d\nu(x, x') \le \epsilon,\ P_1\nu = \mu_n\Big\},$$

and for $\nu \in \Gamma_{\mu_n,\epsilon}$ consider $\mu := P_2\nu$. Then, for $y \in X$, we have

$$\int d(y, x')\, d\mu(x') \le \int \big(d(y, x) + d(x, x')\big)\, d\nu(x, x')$$

$$= \int d(y, x)\, d\mu_n(x) + \int d(x, x')\, d\nu(x, x')$$

$$\le \int d(y, x)\, d\mu_n(x) + \epsilon,$$

and since $\mu_n$ is a finite convex sum of Dirac masses, it follows that $\int d(y, x')\, d\mu(x') < \infty$,
that is, $P_2\nu \in \mathcal{M}_1(X)$, so that we conclude that

$$P_2(\Gamma_{\mu_n,\epsilon}) \subset \mathcal{M}_1(X).$$

Since all measures in $\mathcal{M}_1(X)$ are tight, the Kantorovich-Rubinstein Theorem then
implies that

$$P_2(\Gamma_{\mu_n,\epsilon}) = B_\epsilon(\mu_n)$$

in the same way that the Strassen Theorem implied it in (6.28) for the Prokhorov metric.
Moreover, since $d$ is a metric, it is non-negative, real-valued and continuous, so it follows
that it is a non-negative lower semicontinuous real-valued function. As in the Prokhorov case,
Theorem 2.3 then yields the assertion.
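The moment bound and the inclusion of $P_2\nu$ in the ball can be checked numerically for finitely supported measures. The sketch below assumes $X = \mathbb{R}$ with $d(x, x') = |x - x'|$, where the Monge-Wasserstein distance is known to equal the integral of the absolute difference of the two distribution functions. This special case, the function names and the example data are assumptions chosen for illustration and are not taken from the paper.

```python
from fractions import Fraction as F


def second_marginal(nu):
    """P_2 nu for a coupling given as {(x, x'): weight}."""
    out = {}
    for (_x, xp), w in nu.items():
        out[xp] = out.get(xp, 0) + w
    return out


def transport_cost(nu):
    """Integral of d(x, x') = |x - x'| against the coupling nu."""
    return sum(w * abs(x - xp) for (x, xp), w in nu.items())


def w1_on_line(mu1, mu2):
    """Monge-Wasserstein distance on the real line via the CDF formula."""
    pts = sorted(set(mu1) | set(mu2))
    total, cdf1, cdf2 = F(0), F(0), F(0)
    for a, b in zip(pts, pts[1:]):
        cdf1 += mu1.get(a, 0)
        cdf2 += mu2.get(a, 0)
        # Both CDFs are constant on [a, b), so the integrand is constant there.
        total += abs(cdf1 - cdf2) * (b - a)
    return total


def first_moment(mu, y):
    """Integral of d(y, x) against mu."""
    return sum(w * abs(y - x) for x, w in mu.items())


# Illustrative data: mu_n with n = 2 points and a coupling with P_1 nu = mu_n.
mu_n = {0: F(1, 2), 1: F(1, 2)}
nu = {(0, 0): F(1, 2), (1, 1): F(1, 4), (1, 2): F(1, 4)}

mu = second_marginal(nu)
eps = transport_cost(nu)
print(w1_on_line(mu_n, mu) <= eps)
print(all(first_moment(mu, y) <= first_moment(mu_n, y) + eps for y in (-1, 0, 2, 5)))
```

Since every coupling gives an upper bound on $d_W$, any $\nu$ with cost at most $\epsilon$ and first marginal $\mu_n$ has its second marginal within distance $\epsilon$ of $\mu_n$, which is the easy inclusion used above.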


6.4 Proof of Lemma 3.2 

Since an element $\nu \in \Delta_{n+2}(X \times X)$ may have support smaller than $n + 2$, we represent
it by $\nu = \sum_{i=1}^m \alpha_i \delta_{x_i, x_i'}$, $\alpha_i > 0$, $x_i, x_i' \in X$, $i = 1,..,m$, $\sum_{i=1}^m \alpha_i = 1$, for $m \le n + 2$,
where we also require $(x_i, x_i') \neq (x_j, x_j')$, $i \neq j$. Such an element $\nu \in \Delta_{n+2}(X \times X)$ is a
member of $P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X)$ if and only if $P_1\nu = \mu_n$. Therefore, we conclude that
$\nu \in P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X)$ if and only if

$$\sum_{j=1}^m \alpha_j \delta_{x_j} = \sum_{i=1}^n \beta_i \delta_{y_i}.$$

Since $\beta_i > 0, i = 1,..,n$ and $\alpha_j > 0, j = 1,..,m$ it follows that

$$\{x_j, j = 1,..,m\} = \{y_i, i = 1,..,n\}.$$

In particular, $m$ must satisfy $n \le m \le n + 2$. Moreover, the three possible cases
$m = n, n + 1, n + 2$ appear as follows: when $m = n$, there is a relabeling of the indices
of $(x_j, x_j'), j = 1,..,n$ so that $x_i = y_i, \alpha_i = \beta_i, i = 1,..,n$. When $m = n + 1$, there is
a $j_1 \in \{1,..,n\}$ and a relabeling so that $x_i = y_i, i = 1,..,n$ and $x_{n+1} = y_{j_1}$. Then we
also have $\alpha_i = \beta_i, i \neq j_1$ and $\alpha_{j_1} + \alpha_{n+1} = \beta_{j_1}$. When $m = n + 2$, there is a
relabeling so that $x_i = y_i, i = 1,..,n$ and either 1) there is a $j_1 \in \{1,..,n\}$ such that
$x_{n+1} = x_{n+2} = y_{j_1}$ and $\alpha_i = \beta_i, i \neq j_1$ and $\alpha_{j_1} + \alpha_{n+1} + \alpha_{n+2} = \beta_{j_1}$, or 2) there are two
distinct values $j_1, j_2 \in \{1,..,n\}$ such that $x_{n+1} = y_{j_1}$, $x_{n+2} = y_{j_2}$, $\alpha_i = \beta_i, i \neq j_1, i \neq j_2$,
$\alpha_{j_1} + \alpha_{n+1} = \beta_{j_1}$, and $\alpha_{j_2} + \alpha_{n+2} = \beta_{j_2}$. It is clear that the $m = n$ case amounts to the
statement $\nu \in \Pi_0$ defined in (3.5). Let us now show that the $m = n + 1$ and $m = n + 2$
cases amount to the statements $\nu \in \Pi_i$ for some $i$ and $\nu \in \Pi_{ij}$ for some $i \le j$, defined
in (3.7) and (3.10) respectively, establishing the assertion.

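To that end, for the $m = n + 1$ case, the above assertion states that there is an
$i \in \{1,..,n\}$ and an $x' \in X^{n+1}$ such that

$$\nu = \sum_{k \neq i,\, k \in \{1,..,n\}} \beta_k \delta_{y_k, x_k'} + \alpha_i \delta_{y_i, x_i'} + \alpha_{n+1} \delta_{y_i, x_{n+1}'}$$

with $\alpha_i + \alpha_{n+1} = \beta_i$. Since

$$\sum_{k \neq i,\, k \in \{1,..,n\}} \beta_k \delta_{y_k, x_k'} + \alpha_i \delta_{y_i, x_i'} + \alpha_{n+1} \delta_{y_i, x_{n+1}'} = \sum_{k=1}^n \beta_k \delta_{y_k, x_k'} + (\alpha_i - \beta_i) \delta_{y_i, x_i'} + \alpha_{n+1} \delta_{y_i, x_{n+1}'}$$

$$= \sum_{k=1}^n \beta_k \delta_{y_k, x_k'} + \alpha_{n+1} \big(\delta_{y_i, x_{n+1}'} - \delta_{y_i, x_i'}\big),$$

by the identification $\gamma := \alpha_{n+1}$, we conclude that $\nu \in \Pi_i$ defined in (3.7). The proof in
the $m = n + 2$ case is essentially the same.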
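The case analysis above can be mirrored for finitely supported measures by a small Python sketch. It checks the marginal condition $P_1\nu = \mu_n$ and reports which points $y_j$ carry more than one support point of $\nu$: an empty list corresponds to $m = n$, one entry to $m = n + 1$, and two entries (equal or distinct) to the two sub-cases of $m = n + 2$. The function names and the dictionary encoding are hypothetical, and the weights are assumed to be strictly positive as in the proof.

```python
from collections import Counter
from fractions import Fraction as F


def first_marginal(nu):
    """P_1 nu for a measure on X x X given as {(x, x'): weight}."""
    out = {}
    for (x, _xp), w in nu.items():
        out[x] = out.get(x, 0) + w
    return out


def split_pattern(nu, mu_n):
    """Return the sorted list of points y_j at which nu has extra support
    points (with multiplicity), or None if P_1 nu differs from mu_n.

    Assumes all weights of nu and mu_n are strictly positive."""
    if first_marginal(nu) != mu_n:
        return None
    counts = Counter(x for (x, _xp) in nu)
    extras = []
    for y in sorted(counts):
        extras.extend([y] * (counts[y] - 1))
    return extras


# Illustrative data: n = 2, y_1 = 0, y_2 = 1, beta_1 = beta_2 = 1/2.
mu_n = {0: F(1, 2), 1: F(1, 2)}
nu = {(0, 5): F(1, 2), (1, 6): F(1, 4), (1, 7): F(1, 4)}  # m = n + 1
print(split_pattern(nu, mu_n))
```

In the example the mass $\beta_2 = 1/2$ at $y_2 = 1$ is split as $\alpha_2 + \alpha_3 = 1/4 + 1/4$, matching the relation $\alpha_{j_1} + \alpha_{n+1} = \beta_{j_1}$.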


6.5 Proof of Lemma 3.3 


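Let us define

$$\hat{\Theta} := \Big\{\nu \in P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X) : \nu = \sum_{i=1}^m \alpha_i \delta_{x_i, x_i'},\ 1 \le m \le n + 2,\ \alpha_i > 0,\ x_i, x_i' \in X,\ i = 1,..,m,$$

$$\text{the vectors } \big(1_{y_1}(x_i), \ldots, 1_{y_n}(x_i), 1_{d(x_i, x_i') > \epsilon}, 1\big),\ i = 1,..,m, \text{ are linearly independent}\Big\}. \tag{6.29}$$

Then the identity

$$\Gamma_{\mu_n,\epsilon} = P_1^{-1}\mu_n \cap \{\nu \in \mathcal{M}(X \times X) : \nu\{d > \epsilon\} \le \epsilon\}$$

implies that

$$\Theta = \hat{\Theta} \cap \{\nu \in \mathcal{M}(X \times X) : \nu\{d > \epsilon\} \le \epsilon\}. \tag{6.30}$$

As in Section 3.1, let us compute $\Theta$ by first computing $\hat{\Theta}$ and then using the identity
(6.30). To that end, observe that the definition (6.29) of $\hat{\Theta}$ implies that the support
points $(x_i, x_i'), i = 1,..,m$ contain no duplicates so that we can apply Lemma 3.2 which
implies that we can constrain the values of $m$ in the definition of $\hat{\Theta}$ to $n \le m \le n + 2$.
Moreover, $\hat{\Theta}$ is defined in terms of $P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X)$, and by Lemma 3.2 we have
$P_1^{-1}\mu_n \cap \Delta_{n+2}(X \times X) = \Pi_0 \cup_{k=1}^n \Pi_k \cup_{i \le j} \Pi_{ij}$. Consequently, using the multiindex $\iota$
introduced above (3.13), it is natural to define

$$\hat{\Theta}_\iota := \hat{\Theta} \cap \Pi_\iota$$

and observe that

$$\hat{\Theta} = \hat{\Theta}_0 \cup_{k=1}^n \hat{\Theta}_k \cup_{i \le j} \hat{\Theta}_{ij}.$$

First consider $\hat{\Theta}_0$. Since the definition of $\Pi_0$ implies that $\{x_j, j = 1,..,n\}$ must be
a permutation of $\{y_i, i = 1,..,n\}$, it follows that the linear independence condition of
(6.29) is satisfied in this case. That is,

$$\hat{\Theta}_0 = \Pi_0. \tag{6.31}$$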

Now consider $\Pi_i$ for $i \in \{1,..,n\}$. Then the definition (3.7) of $\Pi_i$ implies that, upon
relabeling, the linear independence of the set $\big(1_{y_1}(x_k), \ldots, 1_{y_n}(x_k), 1_{d(x_k, x_k') > \epsilon}, 1\big)$, $k =
1,..,n + 1$ amounts to the linear independence of the set

$$\big(I_{n \times n},\ z_n,\ 1_n\big)$$

together with

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_{n+1}') > \epsilon},\ 1\big),$$

where $z_n$ has components $1_{d(y_j, x_j') > \epsilon}, j = 1,..,n$, $I_{n \times n}$ is the identity matrix, $1_n$ is the
vector of 1s, and $1_i$ indicates a 1 in the $i$-th position. Because the first row has the
identity matrix, this set of vectors is linearly independent if and only if

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_i') > \epsilon},\ 1\big)$$

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_{n+1}') > \epsilon},\ 1\big)$$

is linearly independent, which is equivalent to the assertion that $x' \in \Lambda_i$ defined in (3.17).
Consequently, we obtain

$$\hat{\Theta}_i = \Pi_i \cap \Lambda_i. \tag{6.32}$$
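For the $m = n + 1$ case the linear independence condition can be tested directly by computing the rank of the constraint vectors $\big(1_{y_1}(x_k), \ldots, 1_{y_n}(x_k), 1_{d(x_k, x_k') > \epsilon}, 1\big)$. The following Python sketch does this with exact rational Gaussian elimination. The helper names, the choice of the real line with $d(x, x') = |x - x'|$ and the example points are assumptions made for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length vectors by exact Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a pivot row at or below the current rank position.
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def constraint_vectors(support, ys, d, eps):
    """One vector (1_{y_1}(x), .., 1_{y_n}(x), 1_{d(x, x') > eps}, 1)
    per support point (x, x')."""
    return [
        [int(x == y) for y in ys] + [int(d(x, xp) > eps), 1]
        for (x, xp) in support
    ]


# Illustrative data: n = 2, y = (0, 10), eps = 1, extra point at y_1 = 0.
dist = lambda a, b: abs(a - b)
ys = [0, 10]
support = [(0, 0), (10, 10), (0, 5)]  # d(0, 0) <= eps while d(0, 5) > eps
vectors = constraint_vectors(support, ys, dist, 1)
print(rank(vectors) == len(vectors))
```

With the extra point placed at $x_{n+1}' = 1/2$ instead of $5$, both indicators at $y_1$ vanish, the two vectors attached to $y_1$ coincide and the rank drops, in agreement with the stated equivalence.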

For $\hat{\Theta}_{ij}$ with $i \le j$, let us first show that $\hat{\Theta}_{ii} = \emptyset$. To that end, let $x' \in X^{n+2}$ and
consider $\nu \in \Pi_{ii}(x')$. Then using the same reasoning as above, it follows that the linear
independence condition is equivalent to the linear independence of the three vectors

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_i') > \epsilon},\ 1\big)$$

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_{n+1}') > \epsilon},\ 1\big)$$

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_{n+2}') > \epsilon},\ 1\big).$$

Since the last row is identically 1, the independence of this set is not possible regardless
of the values of $1_{d(y_i, x_i') > \epsilon}$, $1_{d(y_i, x_{n+1}') > \epsilon}$ and $1_{d(y_i, x_{n+2}') > \epsilon}$. Therefore,

$$\hat{\Theta}_{ii} = \emptyset, \quad i = 1,..,n. \tag{6.33}$$

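So let us consider $\hat{\Theta}_{ij}$ with $i < j$. Then, upon relabeling, the linear independence of the
set $\big(1_{y_1}(x_k), \ldots, 1_{y_n}(x_k), 1_{d(x_k, x_k') > \epsilon}, 1\big)$, $k = 1,..,n + 2$ amounts to the linear independence
of the set

$$\big(I_{n \times n},\ z_n,\ 1_n\big)$$

together with

$$\big(0,..,1_i,..,0,..,0,\ 1_{d(y_i, x_{n+1}') > \epsilon},\ 1\big)$$

$$\big(0,..,0,..,1_j,..,0,\ 1_{d(y_j, x_{n+2}') > \epsilon},\ 1\big).$$

Because the first row has the identity matrix, this set of vectors is linearly independent
if and only if both

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_i') > \epsilon},\ 1\big)$$

$$\big(0,..,1_i,..,0,\ 1_{d(y_i, x_{n+1}') > \epsilon},\ 1\big)$$

and

$$\big(0,..,1_j,..,0,\ 1_{d(y_j, x_j') > \epsilon},\ 1\big)$$

$$\big(0,..,1_j,..,0,\ 1_{d(y_j, x_{n+2}') > \epsilon},\ 1\big)$$

are linearly independent. Then, as in the $\hat{\Theta}_i$ case above, the linear independence of these
two sets is equivalent to requiring that $x' \in \Lambda_{ij}$ defined in (3.18). That is, we have

$$\hat{\Theta}_{ij} = \Pi_{ij} \cap \Lambda_{ij}, \quad i < j.$$

Therefore, we have established that

$$\hat{\Theta} = \Pi_0 \cup_{i=1}^n (\Pi_i \cap \Lambda_i) \cup_{i<j} (\Pi_{ij} \cap \Lambda_{ij}), \tag{6.34}$$

and the assertion then easily follows.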